The unzipper npm package is a module that provides streaming APIs for unzipping .zip files. It allows for extracting the contents of zip files, parsing zip file structures, and more, all while being memory efficient and fast.
Extracting zip files
This feature allows you to extract the contents of a zip file to a specified directory. The code sample demonstrates how to read a zip file as a stream and pipe it to the unzipper's Extract method, which will extract the files to the given path.
const unzipper = require('unzipper');
const fs = require('fs');
fs.createReadStream('path/to/archive.zip')
.pipe(unzipper.Extract({ path: 'output/path' }));
Parsing zip file entries
This feature allows you to parse the contents of a zip file and work with each entry individually. The code sample shows how to read entries from a zip file and handle them based on their type, either extracting files or draining directories.
const unzipper = require('unzipper');
const fs = require('fs');
fs.createReadStream('path/to/archive.zip')
.pipe(unzipper.Parse())
.on('entry', function (entry) {
const fileName = entry.path;
const type = entry.type; // 'Directory' or 'File'
if (type === 'File') {
entry.pipe(fs.createWriteStream('output/path/' + fileName));
} else {
entry.autodrain();
}
});
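The example above builds the destination by concatenating 'output/path/' with the entry's path. Entry paths come from the archive itself, so a name such as ../../evil.txt could point outside the output directory. Below is a minimal sketch of a guard using only Node's path module. The helper name resolveInside is hypothetical and is not part of unzipper.

```javascript
const path = require('path');

// Hypothetical helper (not part of unzipper): resolve an entry path inside
// baseDir and throw if the result would land outside of it.
function resolveInside(baseDir, entryPath) {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, entryPath);
  const relative = path.relative(base, target);
  const escapes =
    relative === '..' ||
    relative.startsWith('..' + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error('Entry path escapes output directory: ' + entryPath);
  }
  return target;
}

console.log(resolveInside('output/path', 'docs/readme.txt'));
```

The returned value could be handed to fs.createWriteStream in place of the concatenated string. Parent directories would still need to exist before writing.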
Buffer-based extraction
This feature allows you to extract files from a zip file that is already loaded into a buffer. The code sample demonstrates how to open a zip file from a buffer and then extract the contents of the first file into another buffer.
const unzipper = require('unzipper');
unzipper.Open.buffer(buffer)
.then(function (directory) {
return directory.files[0].buffer();
})
.then(function (contentBuffer) {
// Use the contentBuffer
});
adm-zip is a JavaScript implementation for zip data compression for NodeJS. It provides functionalities similar to unzipper, such as reading and extracting zip files. Unlike unzipper, which is stream-based, adm-zip works with in-memory buffers, which can be less efficient for large files.
jszip is a library for creating, reading, and editing .zip files with JavaScript, with a focus on client-side use. It can be used in NodeJS as well. It offers a more comprehensive API for handling zip files compared to unzipper, including the ability to generate zip files, but it might not be as optimized for streaming large zip files.
yauzl is another NodeJS library for reading zip files. It focuses on low-level zip file parsing and decompression, providing a minimal API. It's designed to be more memory efficient than adm-zip by using lazy parsing, but it doesn't provide the high-level convenience methods that unzipper does.
This is an active fork and drop-in replacement of node-unzip that addresses the following issues:
The structure of this fork is similar to the original, but it uses Promises and the inherent guarantees provided by node streams to ensure a low memory footprint, and it emits finish/close events at the end of processing. The new Parser will push any parsed entries downstream if you pipe from it, while still supporting the legacy entry event as well.
Breaking changes: The new Parser will not automatically drain entries if there are no listeners or pipes in place.
Unzipper provides simple APIs similar to node-tar for parsing and extracting zip files. There are no added compiled dependencies - inflation is handled by node.js's built in zlib support.
Please note: Methods that use the Central Directory instead of parsing the entire file can be found under Open.
$ npm install unzipper
fs.createReadStream('path/to/archive.zip')
.pipe(unzipper.Extract({ path: 'output/path' }));
Extract emits the 'close' event once the zip's contents have been fully extracted to disk. Extract uses fstream.Writer and therefore needs an absolute path to the destination directory. This directory will be automatically created if it doesn't already exist.
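Because Extract needs an absolute destination, a relative path can be resolved before it is passed in. The sketch below uses Node's path.resolve. The helper name toAbsoluteDestination is hypothetical and only illustrates the idea.

```javascript
const path = require('path');

// Hypothetical helper: turn a possibly relative destination into an
// absolute one (resolved against the current working directory).
function toAbsoluteDestination(destination) {
  return path.resolve(destination);
}

// A value that could be passed as the `path` option to Extract
const destination = toAbsoluteDestination('output/path');
console.log(path.isAbsolute(destination));
```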
Process each zip file entry or pipe entries to another stream.
Important: If you do not intend to consume an entry stream's raw data, call autodrain() to dispose of the entry's contents. Otherwise the stream will halt. .autodrain() returns an empty stream that provides error and finish events. Additionally you can call .autodrain().promise() to get the promisified version of success or failure of the autodrain.
// If you want to handle autodrain errors you can either:
entry.autodrain().promise().catch(e => handleError(e));
// or
entry.autodrain().on('error', e => handleError(e));
Here is a quick example:
fs.createReadStream('path/to/archive.zip')
.pipe(unzipper.Parse())
.on('entry', function (entry) {
const fileName = entry.path;
const type = entry.type; // 'Directory' or 'File'
const size = entry.vars.uncompressedSize; // There is also compressedSize;
if (fileName === "this IS the file I'm looking for") {
entry.pipe(fs.createWriteStream('output/path'));
} else {
entry.autodrain();
}
});
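To picture why an unconsumed entry has to be drained, here is a rough sketch of discarding a readable stream with Node core streams only. It illustrates the concept and is not unzipper's implementation. The helper drainAndWait and the stand-in stream fakeEntry are hypothetical.

```javascript
const { Readable } = require('stream');
const { finished } = require('stream/promises');

// Hypothetical helper: discard all data from a readable stream and
// resolve once it has ended (or reject on a stream error).
function drainAndWait(readable) {
  readable.resume(); // switch to flowing mode; chunks are dropped
  return finished(readable);
}

// Stand-in for an entry stream, for demonstration only
const fakeEntry = Readable.from([Buffer.from('ignored'), Buffer.from('data')]);
drainAndWait(fakeEntry).then(() => console.log('drained'));
```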
If you pipe from unzipper, the downstream components will receive each entry for further processing. This allows for clean pipelines transforming zipfiles into unzipped data.
Example using stream.Transform:
fs.createReadStream('path/to/archive.zip')
.pipe(unzipper.Parse())
.pipe(stream.Transform({
objectMode: true,
transform: function(entry,e,cb) {
const fileName = entry.path;
const type = entry.type; // 'Directory' or 'File'
const size = entry.vars.uncompressedSize; // There is also compressedSize;
if (fileName === "this IS the file I'm looking for") {
entry.pipe(fs.createWriteStream('output/path'))
.on('finish',cb);
} else {
entry.autodrain();
cb();
}
}
}));
Example using etl:
fs.createReadStream('path/to/archive.zip')
.pipe(unzipper.Parse())
.pipe(etl.map(entry => {
if (entry.path == "this IS the file I'm looking for")
return entry
.pipe(etl.toFile('output/path'))
.promise();
else
entry.autodrain();
}))
unzipper.ParseOne([regex]) is a convenience method that unzips only one file from the archive and pipes the contents down (not the entry itself). If no search criteria are specified, the first file in the archive will be unzipped. Otherwise, each filename will be compared to the criteria and the first one to match will be unzipped and piped down. If no file matches, the stream will end without any content.
Example:
fs.createReadStream('path/to/archive.zip')
.pipe(unzipper.ParseOne())
.pipe(fs.createWriteStream('firstFile.txt'));
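The selection rule described above (first file when no criteria are given, otherwise the first filename that matches) can be sketched on a plain list of names. The helper pickFirstMatch is hypothetical and only mirrors the described behaviour. It makes no claim about how unzipper performs the comparison internally.

```javascript
// Hypothetical helper illustrating the selection rule: with no criteria
// the first name wins, otherwise the first name matching the regex.
function pickFirstMatch(fileNames, criteria) {
  for (const name of fileNames) {
    if (!criteria || criteria.test(name)) return name;
  }
  return undefined; // nothing matched: no content would be piped down
}

const names = ['readme.md', 'data/a.csv', 'data/b.csv'];
console.log(pickFirstMatch(names));           // readme.md
console.log(pickFirstMatch(names, /\.csv$/)); // data/a.csv
console.log(pickFirstMatch(names, /\.png$/)); // undefined
```

A regex with the global flag keeps state between test() calls, so a non-global pattern is the safer choice for this kind of check.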
While the recommended strategy of consuming the unzipped contents is using streams, it is sometimes convenient to be able to get the full buffered contents of each file. Each entry provides a .buffer function that consumes the entry by buffering the contents into memory and returning a promise to the complete buffer.
fs.createReadStream('path/to/archive.zip')
.pipe(unzipper.Parse())
.pipe(etl.map(async entry => {
if (entry.path == "this IS the file I'm looking for") {
const content = await entry.buffer();
await fs.promises.writeFile('output/path', content);
}
else {
entry.autodrain();
}
}))
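Conceptually, buffering an entry means collecting every chunk of a stream and joining the chunks into one Buffer. The sketch below shows that idea with Node core streams. collectToBuffer is a hypothetical helper, not the library's .buffer implementation, and the input stream is a stand-in.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: collect every chunk of a readable stream and
// resolve with a single Buffer holding the full contents.
async function collectToBuffer(readable) {
  const chunks = [];
  for await (const chunk of readable) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Stand-in stream for demonstration
collectToBuffer(Readable.from([Buffer.from('hello '), Buffer.from('world')]))
  .then((buf) => console.log(buf.toString())); // hello world
```

Since the whole content is held in memory at once, this approach suits small entries better than large ones.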
The parser emits finish and error events like any other stream. The parser additionally provides a promise wrapper around those two events to allow easy folding into existing Promise-based structures.
Example:
fs.createReadStream('path/to/archive.zip')
.pipe(unzipper.Parse())
.on('entry', entry => entry.autodrain())
.promise()
.then( () => console.log('done'), e => console.log('error',e));
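A promise wrapper around the finish and error events can be sketched in a few lines for any writable stream. The helper whenFinished and the sink stream below are hypothetical stand-ins, shown only to illustrate the pattern rather than the parser's own code.

```javascript
const { Writable } = require('stream');

// Hypothetical helper: resolve on 'finish', reject on 'error'.
function whenFinished(stream) {
  return new Promise((resolve, reject) => {
    stream.once('finish', resolve);
    stream.once('error', reject);
  });
}

// Stand-in writable that accepts and discards every chunk
const sink = new Writable({
  write(chunk, encoding, callback) {
    callback();
  },
});

whenFinished(sink).then(
  () => console.log('done'),
  (e) => console.log('error', e)
);
sink.end('some data');
```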
Archives created by legacy tools usually have filenames encoded with the IBM PC (Windows OEM) character set. You can decode filenames with your preferred character set:
const il = require('iconv-lite');
fs.createReadStream('path/to/archive.zip')
.pipe(unzipper.Parse())
.on('entry', function (entry) {
// if the zip tool follows the ZIP spec, this flag is set for UTF-8 encoded filenames
const isUnicode = entry.props.flags.isUnicode;
// decode "non-unicode" filename from OEM Cyrillic character set
const fileName = isUnicode ? entry.path : il.decode(entry.props.pathBuffer, 'cp866');
const type = entry.type; // 'Directory' or 'File'
const size = entry.vars.uncompressedSize; // There is also compressedSize;
if (fileName === "Текстовый файл.txt") {
entry.pipe(fs.createWriteStream(fileName));
} else {
entry.autodrain();
}
});
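In the ZIP format, the Unicode indication lives in bit 11 (0x0800) of each entry's general purpose bit flag. The sketch below shows how such a flag could be tested and used to choose a decoder. The helpers isUtf8Name and decodeName are hypothetical, and the legacy decoder is supplied by the caller (for example a function wrapping il.decode with 'cp866', as in the example above).

```javascript
// Bit 11 (0x0800) of the ZIP general purpose bit flag marks the file name
// and comment as UTF-8 encoded.
const UTF8_FLAG = 0x0800;

// Hypothetical helper: test the UTF-8 bit in a raw flags value
function isUtf8Name(generalPurposeFlags) {
  return (generalPurposeFlags & UTF8_FLAG) !== 0;
}

// Hypothetical helper: decode as UTF-8 when flagged, otherwise defer to a
// caller-supplied legacy decoder.
function decodeName(nameBuffer, generalPurposeFlags, decodeLegacy) {
  return isUtf8Name(generalPurposeFlags)
    ? nameBuffer.toString('utf8')
    : decodeLegacy(nameBuffer);
}

console.log(isUtf8Name(0x0800)); // true
console.log(isUtf8Name(0x0008)); // false
```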
The previous methods rely on the entire zipfile being received through a pipe. The Open methods take a different approach: they load the central directory first (at the end of the zipfile) and provide the ability to pick and choose which files to extract, even extracting them in parallel. The Open methods return a promise on the contents of the directory, with individual files listed in an array. Each file element has the following methods:
stream([password]) - returns a stream of the unzipped content which can be piped to any destination
buffer([password]) - returns a promise on the buffered content of the file
If the file is encrypted you will have to supply a password to decrypt it, otherwise you can leave it blank.
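The central directory can be found from the end of the archive because a ZIP file finishes with an End of Central Directory (EOCD) record that stores the directory's size and offset. The sketch below locates that record in a Buffer. It is a simplified illustration that ignores ZIP64 and does not validate the comment length. findEndOfCentralDirectory is a hypothetical helper, not unzipper's code.

```javascript
// EOCD record: signature 0x06054b50, at least 22 bytes long, located at
// the end of the archive (followed only by an optional comment).
const EOCD_SIGNATURE = 0x06054b50;
const EOCD_MIN_SIZE = 22;

// Hypothetical helper: scan backwards for the EOCD record and read the
// fields that say where the central directory lives.
function findEndOfCentralDirectory(buffer) {
  for (let i = buffer.length - EOCD_MIN_SIZE; i >= 0; i--) {
    if (buffer.readUInt32LE(i) === EOCD_SIGNATURE) {
      return {
        offset: i,
        totalEntries: buffer.readUInt16LE(i + 10),
        centralDirectorySize: buffer.readUInt32LE(i + 12),
        centralDirectoryOffset: buffer.readUInt32LE(i + 16),
      };
    }
  }
  return null; // no EOCD record found
}

// A 22-byte EOCD record on its own is a valid, empty zip archive
const emptyZip = Buffer.alloc(EOCD_MIN_SIZE);
emptyZip.writeUInt32LE(EOCD_SIGNATURE, 0);
console.log(findEndOfCentralDirectory(emptyZip));
```

Only the tail of the file is needed to run this search, which is what makes reading the directory cheap compared with streaming the whole archive.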
Unlike adm-zip, the Open methods will never read the entire zipfile into a buffer.
Open.file returns a Promise to the central directory information with methods to extract individual files. start and end options are used to avoid reading the whole file.
Example:
async function main() {
const directory = await unzipper.Open.file('path/to/archive.zip');
console.log('directory', directory);
return new Promise( (resolve, reject) => {
directory.files[0]
.stream()
.pipe(fs.createWriteStream('firstFile'))
.on('error',reject)
.on('finish',resolve)
});
}
main();
Open.url will return a Promise to the central directory information from a URL pointing to a zipfile. Range-headers are used to avoid reading the whole file. Unzipper does not ship with a request library so you will have to provide it as the first argument.
Live Example: (extracts a tiny xml file from the middle of a 500MB zipfile)
const request = require('request');
const unzipper = require('unzipper');
async function main() {
const directory = await unzipper.Open.url(request,'http://www2.census.gov/geo/tiger/TIGER2015/ZCTA5/tl_2015_us_zcta510.zip');
const file = directory.files.find(d => d.path === 'tl_2015_us_zcta510.shp.iso.xml');
const content = await file.buffer();
console.log(content.toString());
}
main();
This function takes a second parameter which can either be a string containing the url to request, or an options object to invoke the supplied request library with. This can be used when other request options are required, such as custom headers or authentication to a third party service.
const request = require('google-oauth-jwt').requestWithJWT();
const googleStorageOptions = {
url: `https://www.googleapis.com/storage/v1/b/m-bucket-name/o/my-object-name`,
qs: { alt: 'media' },
jwt: {
email: google.storage.credentials.client_email,
key: google.storage.credentials.private_key,
scopes: ['https://www.googleapis.com/auth/devstorage.read_only']
}
};
async function getFile(req, res, next) {
const directory = await unzipper.Open.url(request, googleStorageOptions);
const file = directory.files.find((file) => file.path === 'my-filename');
return file.stream().pipe(res);
}
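Range headers let a client ask a server for part of a resource instead of the whole body. The sketch below builds the two standard forms of the header value: an inclusive byte span and a suffix covering the last bytes of the file. The helpers byteRange and suffixRange are hypothetical, and which form unzipper sends is not stated here.

```javascript
// Hypothetical helper: Range value for an inclusive span of bytes
function byteRange(start, end) {
  return `bytes=${start}-${end}`;
}

// Hypothetical helper: Range value for the last `length` bytes
function suffixRange(length) {
  return `bytes=-${length}`;
}

// Example header objects that a request library could be given
console.log({ Range: byteRange(0, 99) }); // first 100 bytes
console.log({ Range: suffixRange(22) });  // last 22 bytes
```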
Open.s3 will return a Promise to the central directory information from a zipfile on S3. Range-headers are used to avoid reading the whole file. Unzipper does not ship with the aws-sdk so you have to provide an instantiated client as the first argument. The params object requires Bucket and Key to fetch the correct file.
Example:
const unzipper = require('unzipper');
const AWS = require('aws-sdk');
const s3Client = new AWS.S3(config);
async function main() {
const directory = await unzipper.Open.s3(s3Client,{Bucket: 'unzipper', Key: 'archive.zip'});
return new Promise( (resolve, reject) => {
directory.files[0]
.stream()
.pipe(fs.createWriteStream('firstFile'))
.on('error',reject)
.on('finish',resolve)
});
}
main();
If you already have the zip file in-memory as a buffer, you can open the contents directly.
Example:
// never use readFileSync - only used here to simplify the example
const buffer = fs.readFileSync('path/to/archive.zip');
async function main() {
const directory = await unzipper.Open.buffer(buffer);
console.log('directory',directory);
// ...
}
main();
See LICENCE
FAQs
Unzip cross-platform streaming API
We found that unzipper demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.